Compressed Pattern Matching for SEQUITUR
Abstract
Sequitur, due to Nevill-Manning and Witten [18], is a powerful program for inferring a phrase hierarchy from input text; it also provides extremely effective compression of large quantities of semi-structured text [17]. In this paper, we address the problem of searching Sequitur-compressed text directly. We present a compressed pattern matching algorithm that finds a pattern in compressed text without explicit decompression. We show that our algorithm is approximately 1.27 times faster than decompression followed by an ordinary search.
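To make the idea of searching without explicit decompression concrete, here is a minimal Python sketch. It is not the algorithm of the paper, which the abstract does not describe; it is a generic illustration that assumes the compressed text is given as an acyclic grammar in which each rule expands to a sequence of single-character terminals and other rules. The toy grammar, the `Summary` type, and the function names are all hypothetical. For each rule the sketch keeps only a match count plus the first and last m - 1 characters of its expansion (m being the pattern length), which is enough to detect matches that straddle a boundary between two symbols.

```python
from functools import lru_cache
from typing import NamedTuple


class Summary(NamedTuple):
    count: int   # pattern occurrences inside the expansion
    prefix: str  # first min(length, m - 1) characters of the expansion
    suffix: str  # last min(length, m - 1) characters of the expansion
    length: int  # length of the expansion


def count_overlapping(text, pattern):
    """Count possibly overlapping occurrences of pattern in text."""
    count, pos = 0, text.find(pattern)
    while pos != -1:
        count += 1
        pos = text.find(pattern, pos + 1)
    return count


def count_in_grammar(grammar, start, pattern):
    """Count occurrences of pattern in the text derived from `start`,
    visiting each rule once instead of expanding the whole text."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    k = len(pattern) - 1  # context needed on each side of a boundary

    def combine(a, b):
        # Both pieces are shorter than the pattern, so every match
        # found in their concatenation crosses the boundary.
        crossing = count_overlapping(a.suffix + b.prefix, pattern)
        tail = a.suffix + b.suffix
        return Summary(
            a.count + b.count + crossing,
            (a.prefix + b.prefix)[:k],
            tail[max(0, len(tail) - k):],
            a.length + b.length,
        )

    @lru_cache(maxsize=None)
    def summarize(symbol):
        if symbol not in grammar:  # terminal symbol
            return Summary(
                count_overlapping(symbol, pattern),
                symbol[:k],
                symbol[max(0, len(symbol) - k):],
                len(symbol),
            )
        result = Summary(0, "", "", 0)
        for child in grammar[symbol]:
            result = combine(result, summarize(child))
        return result

    return summarize(start).count


# Hypothetical toy grammar that derives "abcdbcabcd".
grammar = {"S": ["C", "A", "C"], "A": ["b", "c"], "C": ["a", "A", "d"]}
print(count_in_grammar(grammar, "S", "bc"))  # prints 3
```

Because every rule is summarized once and then reused, the work depends on the size of the grammar and the pattern length rather than on the length of the uncompressed text.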
Related resources
Collage system: a unifying framework for compressed pattern matching
We introduce a general framework which is suitable to capture the essence of compressed pattern matching according to various dictionary-based compressions. It is a formal system to represent a string by a pair of dictionary D and sequence S of phrases in D. The basic operations are concatenation, truncation, and repetition. We also propose a compressed pattern matching algorithm for the framew...
Pattern-Matching Problems for
The power of weighted finite automata to describe very complex images was widely studied, see [5, 6, 7]. Finite automata can also be used as an effective tool for compression of two-dimensional images. There are some software packages using this type of compression, see [12, 6]. We consider the complexity of some pattern-matching problems for two-dimensional images which are highly compressed using...
The Complexity of Two-dimensional Compressed Pattern Matching
We study the computational complexity of two-dimensional compressed pattern matching problems. Among other things, we design an efficient randomized algorithm for the equality problem of two compressed two-dimensional patterns, and we prove the computational hardness of general two-dimensional compressed pattern matching.
The Complexity of Two-Dimensional Compressed Pattern-
We consider the complexity of problems for highly compressed 2-dimensional texts: compressed pattern-matching (when the pattern is not compressed and the text is compressed) and fully compressed pattern-matching (when also the pattern is compressed). First we consider 2-dimensional compression in terms of straight-line programs, see [9]. It is a natural way for representing very highly compresse...
A New Compression Method for Compressed Matching
A practical adaptive compression algorithm based on LZSS is presented, which is especially constructed to solve the compressed pattern matching problem, i.e., pattern matching directly in a compressed text without decompressing.
Publication date: 2001